Speaker Diarization Using Unsupervised Discriminant Analysis of Inter-channel Delay Feature

Authors

Nicholas W. D. Evans

Corinne Fredouille

Jean-François Bonastre

Abstract

When multiple microphones are available, estimates of inter-channel delay, which characterise a speaker’s location, can be used as features for speaker diarization. Background noise and reverberation can, however, lead to noisy features and poor performance. To ameliorate these problems, this paper presents a new approach to the discriminant analysis of delay features for speaker diarization. This novel, and nonetheless unsupervised, approach aims to increase speaker separability in delay-space. We assess the approach on subsets of four standard NIST RT datasets and demonstrate a relative improvement in diarization error rate of 25% on a separate evaluation set using delay features alone.
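As a rough illustration of what an inter-channel delay feature is, the sketch below estimates the delay between two microphone channels by picking the lag that maximises their time-domain cross-correlation. The abstract does not say how the delays are estimated, so the plain cross-correlation, the function name `estimate_delay`, and the toy signals are all assumptions made for this example, not the authors' method.

```python
import math


def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best aligns with `ref`.

    A positive result means `other` is a delayed copy of `ref`.
    Uses plain time-domain cross-correlation (an illustrative choice only).
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += x * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical two-channel example: a short pulse that reaches the second
# microphone 3 samples after it reaches the first.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
ch1 = [0.0] * 10 + pulse + [0.0] * 10
ch2 = [0.0] * 13 + pulse + [0.0] * 7

delay_samples = estimate_delay(ch1, ch2, max_lag=8)
print(delay_samples)
```

With more than two microphones, one such delay per channel pair (relative to a reference channel) would form a delay vector per frame, and vectors of this kind are what a discriminant analysis would operate on.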

Similar resources

Speaker diarization of spontaneous meeting room conversations

Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization systems have isolated three main issues with the systems: overlapping speech, effects of background noise and speech/nonspeech detection errors on...

ALIZE/spkdet: a state-of-the-art open source software for speaker recognition

This paper presents the ALIZE/SpkDet open source software packages for text-independent speaker recognition. This software is based on the well-known UBM/GMM approach. It also includes the latest speaker recognition developments such as Latent Factor Analysis (LFA) and unsupervised adaptation. Discriminant classifiers such as SVM supervectors are also provided, linked with the Nuisance Attribut...

Domain Adaptation of PLDA Models in Broadcast Diarization by Means of Unsupervised Speaker Clustering

This work presents a new strategy to perform diarization dealing with high variability data, such as multimedia information in broadcast. This variability is highly noticeable among domains (inter-domain variability among chapters, shows, genres, etc.). Therefore, each domain requires its own specific model to obtain the optimal results. We propose to adapt the PLDA models of our diarization sy...

Speaker Diarization Using Gaussian Mixture Turns and Segment Matching

Speaker diarization aims to detect “who spoke when” in large audio segments. It is an important task in the processing of broadcast news audio, making audio segment selection and indexing easier. In this paper an unsupervised speaker diarization scheme is proposed using a Gaussian Mixture Model as a Universal Background Model, Bayesian Information Criterion and fingerprint detection. A de...

Speaker Diarization Based on GMM Supervectors and Unsupervised Intra-speaker Variability Modeling

This paper presents a novel framework for speaker diarization. Audio is parameterized by a sequence of GMM-supervectors representing overlapping short segments of speech. Session-dependent intra-session intra-speaker variability is estimated online in an unsupervised manner, and is removed from the supervectors using Nuisance Attribute Projection (NAP). The supervectors are then projected using ...

Publication date: 2009
